AITopics | explainable semantic space

explainable semantic space

Explainable Semantic Space by Grounding Language to Vision with Cross-Modal Contrastive Learning

Neural Information Processing Systems, Dec-24-2025, 13:47:22 GMT

In natural language processing, most models learn semantic representations from text alone. The learned representations encode "distributional semantics" but fail to connect to any knowledge about the physical world. In contrast, humans learn language by grounding concepts in perception and action, and the brain encodes "grounded semantics" for cognition. Inspired by this notion and recent work in vision-language learning, we design a two-stream model for grounding language learning in vision. The model includes a VGG-based visual stream and a BERT-based language stream. The two streams merge into a joint representational space. Through cross-modal contrastive learning, the model first learns to align visual and language representations using the MS COCO dataset. Using the Visual Genome dataset, the model then learns to retrieve visual objects with language queries through a cross-modal attention module and to infer the visual relations between the retrieved objects through a bilinear operator. After training, the model's language stream is a stand-alone language model capable of embedding concepts in a visually grounded semantic space.
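The alignment step described in the abstract can be illustrated with a minimal sketch of a symmetric, InfoNCE-style contrastive loss over a batch of paired image and text embeddings. The abstract does not specify the exact loss, so this form is an assumption. The function names and the `temperature` value are hypothetical, and random arrays stand in for the outputs of the VGG-based and BERT-based streams.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    # Scale each embedding to unit length so dot products are cosine similarities.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def cross_modal_contrastive_loss(image_emb, text_emb, temperature=0.07):
    # Assumed loss form (not taken from the paper): row i of image_emb and
    # row i of text_emb are a matching pair; every other pairing in the
    # batch serves as a negative.
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    logits = img @ txt.T / temperature          # shape: (batch, batch)
    idx = np.arange(logits.shape[0])
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()  # image -> text
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()  # text -> image
    return 0.5 * (loss_i2t + loss_t2i)

# Toy stand-ins for the two streams' outputs in the joint space.
rng = np.random.default_rng(0)
text = rng.normal(size=(4, 8))
aligned_images = text + 0.01 * rng.normal(size=(4, 8))  # near-perfect pairs
shuffled_images = aligned_images[[1, 2, 3, 0]]          # deliberately mismatched

loss_aligned = cross_modal_contrastive_loss(aligned_images, text)
loss_shuffled = cross_modal_contrastive_loss(shuffled_images, text)
print(loss_aligned, loss_shuffled)
```

In this toy setup the correctly paired batch should give a lower loss than the mismatched one, which is the signal that pulls matching image and text embeddings together in the joint space.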

cross-modal contrastive learning, explainable semantic space, grounding language, (9 more...)

Industry: Education > Curriculum > Subject-Specific Education (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Explainable Semantic Space by Grounding Language to Vision with Cross-Modal Contrastive Learning

Neural Information Processing Systems, Jan-18-2025, 00:24:32 GMT

cross-modal contrastive learning, explainable semantic space, grounding language, (5 more...)

Industry: Education > Curriculum > Subject-Specific Education (0.52)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)
